Quantum corrections to the magnetoconductivity of surface states in three-dimensional topological insulators

The interplay between quantum interference, electron-electron interaction (EEI), and disorder is one of the central themes of condensed matter physics. Such interplay can cause high-order magnetoconductance (MC) corrections in semiconductors with weak spin-orbit coupling (SOC). However, it remains unexplored how the magnetotransport properties are modified by the high-order quantum corrections in the electron systems of symplectic symmetry class, which include topological insulators (TIs), Weyl semimetals, graphene with negligible intervalley scattering, and semiconductors with strong SOC. Here, we extend the theory of quantum conductance corrections to two-dimensional (2D) electron systems with the symplectic symmetry, and study experimentally such physics with dual-gated TI devices in which the transport is dominated by highly tunable surface states. We find that the MC can be enhanced significantly by the second-order interference and the EEI effects, in contrast to the suppression of MC for the systems with orthogonal symmetry. Our work reveals that detailed MC analysis can provide deep insights into the complex electronic processes in TIs, such as the screening and dephasing effects of localized charge puddles, as well as the related particle-hole asymmetry.

Theory of quantum corrections to magnetoconductance in 2D systems of symplectic symmetry The surface state of topological insulators (TIs) is a metallic state with strong spin-orbit interaction. From the symmetry point of view, this disordered metal belongs to the symplectic symmetry class. The topological protection is associated with the time-reversal symmetry T that squares to minus one: T 2 = −1. The external magnetic field breaks the time-reversal symmetry, driving the system to the unitary class. The magnetoresistance of the topological insulator (TI) can therefore be understood as a crossover between the symplectic and the unitary ensembles. The one-loop description of such a transition was derived in Ref. S1 . The computation performed by Hikami, Larkin, and Nagaoka (HLN) took into account the first-order correction to the conductance in 1/g, the inverse dimensionless conductance. For metals with moderate values of dimensionless conductance, the subleading terms play a substantial role. Here we show how to extend the HLN result to account for the next subleading term. We first show how this in the context of non-topological metals in the symplectic class and then we apply the same ideas for TIs.
In the absence of interactions, the conductance of a disordered metal is governed by one-parameter scaling theory S1 .
It can be described by the renormalization group (RG) equation: where t is the dimensionless coupling constant, inversely proportional to g. The RG flow occurs with a running scale L which is cut off by the physical spatial scale restricting the quantum interference. The diffusion motion of electrons starts at the transport scattering length l tr that serves as the microscopic scale for the theory.

A. Magnetoconductance in non-topological symplectic metals
We consider a two-dimensional disordered metallic film formed by spinful electrons. On the level of Drude formula, the conductance (summed over both spin components) is given by Here we have introduced the bare value of dimensionless conductance where l tr = v F τ tr with τ tr the transport scattering time and v F the Fermi velocity. The Fermi momentum p F is related to the total density of electrons n by the standard formula (both spins included, hence an overall factor of 2) To account for the localization effects, one uses the one-parameter scaling theory 2 , Eq. (S1). The β-function is determined by the symmetry class, controlled by the ratio between several length scales. The magnetic field suppresses quantum interference for distances greater than magnetic length l B = /(eB). The effects of spin-orbit are characterized by l so . The coherence in the sample is destroyed by dephasing at length l ϕ . The RG flow terminates by the sample size L sample when it is smaller than l ϕ .
Within the two-loop approximation, the β-function reads S2,S3,S4 Here is the dimensionless resistance, where g is the total dimensionless conductance of spinful electrons.
For the unitary case, the one-loop localization correction is absent. Therefore, the MC starts with terms proportional to 1/g: Here we have separated the contributions to the logarithmic correction coming from the singlet channel (with the coefficient 1) and from the triplet channels (with the coefficient 3). Since in the unitary class the Cooperons are suppressed by the magnetic field, the two-loop correction stems solely from diagrams with the diffuson modes.
For the orthogonal case, the two-loop correction vanishes. This implies that the two-loop contributions coming from the diffuson modes are canceled by the contributions involving Cooperons. Because we know the value of the diffuson contribution from the unitary-class calculation, Eq. (S7), we can now describe the crossover between the orthogonal and unitary ensembles S5 , by adding and subtracting the unitary-class term to the one-loop orthogonal-class correction: g(L) = g (0) + 1 − 3 π ln L l tr − 1 + 3 2π 2 g (0) ln L l tr + 1 + 3 2π 2 g (0) ln L l tr .
Here, the first and the last logarithmic terms on the right-hand side are associated with Cooperons and hence are suppressed in the strong magnetic field, leading then to Eq. (S7). Moreover, from this expression, one can infer the results for the symplectic class by suppressing the contributions associated with the triplet channel [terms with the coefficient 3 in Eq. (S8)]: Now we can unite Eqs. (S9), (S8), and (S7). To do that, we bear in mind that to describe the crossover between the symplectic and other classes, one should cut off the logarithms emerging in the triplet channels by l so . In addition, the logarithmic corrections coming from the diagrams involving Cooperons are also cut off by a magnetic field in both the singlet and triplet channels. As opposed to this, the logarithms arising from the diffuson diagrams are insensitive to l B . Finally, all logarithmic corrections must be cut off by the dephasing length l ϕ . Therefore, the crossover between all possible ensembles is accounted by To the zeroth order in the dimensionless conductance, this reproduces the HLN result. In addition, Eq. (S10) accounts for the terms that are proportional to the first order in 1/g. In particular, for the regime l B < l ϕ one finds Casting this in the form we find the prefactor for a two-dimensional non-topological symplectic metal. Note that this signifies that the value of α in units of 1/2 has a systematic correction in the limit of clean metals (g 1). This correction increases the values of α for metals in the symplectic symmetry class. We now proceed and apply the same logic for the metallic states on the surface of a time-reversal topological insulator.
In TIs, the spin-orbit interaction is strong, leading to the "spin-momentum locking". As a result, the effective spin-orbit length is the shortest length scale, l so < l tr . The transport is carried by two parallel surfaces, each one hosting Dirac electrons. Therefore, it makes sense to describe the problem in terms of the one-surface conductivity σ 1 . The total conductivity of the TI is a sum of the two contributions; for equivalent surfaces, we have just a factor of two: The bare value of electric conductivity at one surface is given by Drude formula: 1 .
Here the dimensionless conductance of one surface Note that the bare conductivity of a single surface of TI coincides with the conductivity of one spin component of a non-topological metal.
In the absence of an external magnetic field, each surface of TI belongs to a symplectic ensemble. For the strong magnetic field, the system is driven to the unitary ensemble. Therefore, the magnetoresistance can be again described in terms of the crossover between these two ensembles. The RG equation written in terms of the coupling t remains unchanged, see Eq. (S5), but, because of the spin-momentum locking, the relation between t and conductance needs to be reformulated. Specifically, in the unitary class, the coupling constant is related to the dimensionless conductance of a single surface as Here, the factor of 2 accounts for the contributions of the top and bottom surfaces to the total conductance. In the symplectic class, however, Solving the RG equation in the unitary case one finds Note that the triplet channel here is absent and the correction is fully governed by the singlet channel. For the symplectic case the RG equation reads.
The magnetoconductance (MC) of a surface is At finite magnetic field, and applying the same logic as we used for non-topological metals, we find a surface MC In the regime l B < L the total dimensionless conductance of the sample is given by Casting this result as one establishes In fact, this value of the prefactor α can be used not only in the logarithmic dependence of the magnetoconductivity, but also in a more refined Hikami-Larkin-Nagaoka formula, which describes the crossover at l B ∼ l ϕ , as used in the main text (cf. Ref. S5 ).

Particle-hole channel
Coulomb interaction modifies the RG flow of two-dimensional disordered Fermi liquid S6 . In particular, this results in the logarithmically divergent corrections to the dc conductivity, known as Altshuler-Aronov (AA) effect S7 . The inclusion of the Cooperon diagrams into the AA effect S5,S7 leads to the interaction-induced magnetoresistance. In the orthogonal symmetry class the problem was studied in Refs. S5,S8 . It was shown that the combined effect of the Coulomb interaction and weak localization lead to the enhancement of conductivity at low magnetic field. In the low-temperature (relatively high B) limit (l T l B l tr ) one finds S5 : Here l T = D/k B T . For weaker fields (l ϕ l B l T ), the numerical prefactor in this correction changes by a factor of 3/2 S8 .
In this section, we account for analogous effects in TIs. To do that, we use the eigenfunctions of the symplectic model, which are the spinors (with s = ± describing the branch of the spectrum) parametrized by the angle of the 2D momentum ϕ k . For not too strong disorder, we neglect the processes that involve the states from the − brach and focus entirely on the + electrons.
We consider the short-range Gaussian disorder, Here the overbar · · · stands for averaging over disorder realizations, and V 0 quantifies the strength of the scattering potential. The scattering matrix element between the states +, k and +, k is given by S9 k , +|V |k, and θ = ϕ k − ϕ k is the scattering angle.
Let us compute the interaction-induced correction to MC. The diagrams in the particle-hole channel are shown in Fig. S1. These are essentially Althsuler-Aronov diagram with an insertion of the Cooperon ladder. The latter is sensitive to the external magnetic field, that induces the magneto-dependence of the conductivity.
To evaluate these diagrams, we first analyze the diffuson ladder and Cooperon ladders. The construction of these ladders involves two different probabilities W andW . The first arises if the impurity line is inserted between the pointing in the opposite directions: Defining the angles between fast and slow momenta Considering the ladder that starts with a single disorder line, one obtains a Dyson equation for the diffuson propagator S9 : Here In the diffusive limit of small frequencies and momenta are then written as: D(ϕ, ϕ ) 1 2πντ 1 τ 1 + iql(cos ϕ + cos ϕ ) + (q 2 l 2 − 2iωτ ) cos ϕ cos ϕ Dq 2 − iω + 2 sin ϕ sin ϕ , where ν is the density of states and the diffusion constant is given by D = v 2 F τ tr /2 = v 2 F τ (note that in this model, for point-like impurities, the transport scattering time τ tr is twice larger than the total scattering time τ ).
The Cooperon ladder (that also starts with a single line) obeys the equation The resulting Cooperon propagator acquires an additional phase factor compared to the diffuson propagator (S33): In the absence of dephasing,C (ϕ, ϕ ) = D(ϕ, ϕ ).
We now use the propagators (S33) and (S36) to compute the interaction-induced corrections to the conductivity depicted in Fig. S1. The diagrams shown in Fig. S1 contain Hikami boxes, i.e., the insertions of the disorder line between two parallel retarded or advanced Green functions. The direction of the arrows of the Green functions gives rise to W andW factors. The phase difference of these factors matches the difference between Cooperon and diffuson ladders. As a result, the relative phase factors is completely cancelled. Therefore, in terms ofC, each diagram is identical to those in Ref. S8 , but with an opposite sign that emerges from the π shift in the directions of the incoming and outgoing momenta. Therefore, the contribution of all diagrams in Fig. S1 for the dimensionless conductance of a single surface at In the presence of external gate placed at the distance d from the sample, the lorarithm in Eqs. (S38) and (S39) is cut by the minimum of l B and d.

Particle-particle (Cooper) channel
In addition to the interaction in the particle-hole channel discussed above, the interaction in the particle-particle channel (i.e., the Cooper channel) affects the conductivity. On the perturbative level, the dimensionless magnetoconductance g 1 (B, T ) acquires corrections governed by the coupling constant γ c that accounts for a Cooper ladder of original interactions lines. In the case of attractive interaction, this Cooper ladder describes superconducting fluctuations. For the repulsive interaction, it does not lead to superconducting instability, but nevertheless affects the conductance.
The full system of the RG equations that account for the Cooper channel was derived in Ref. S6 (see also Ref. S10 ).
In our analysis, we go up to the g −1 order, accounting for the effects of the second-loop WAL in Sections I A & I B and interaction-induced magnetoresistivity in the partcle-hole channel studied in Section I C 1. We introduce the parameter of the running scale y = log(L/l tr ). In this approximation, the one-surface dimensionless conductance (for L < l B < l T ) for the long-range Coulomb repulsion is governed by the equations The first term on the right hand side of Eq. S40 comes from non-interacting one-loop RG, in agreement with the HLN formula. The second term is a combination of two-loop non-interacting correction, Eq. (S21), and the cross-term accounting the interaction in the particle-hole channel, Eq.(S38). The last term is correction coming from particleparticle channel, which is commonly referred to as the DOS correction in the Cooper channel. The first term in Eq. (S41) accounts for the standard Cooper ladder of interactions (which describes the conventional superconducting instability at the mean-field level for attractive interaction, γ c < 0). The second term in Eq. (S41) amounts for the renormalization of the coupling constant γ c due to the interplay of disorder and Coulomb interaction S8 . When the bare Cooper coupling is dominated by attraction, this term leads to the suppression of superconductivity in disordered systems.
Near the weak coupling limit (g 1 1 and γ c 1), one can solve equations (S40) and (S41) separately. With this accuracy, one finds 1 is the value of the Cooper interaction constant at the ultraviolet limit L ∼ l tr , with T c ∼ E F 1/τ . When the Fermi energy E F is not too large, it is more convenient to express the initial condition for the RG flow through the Drude conductance: γ c (0) ∼ log −1 (g) 1.
In the weak-coupling limit, where both RG charges 1/g and γ c are small, we infer from the Eq. S42 that γ c remains small, as long as γ c (0)g 1 ln(l B /l tr ). One notes that in the infrared limit (for y → ∞) this interaction constant flows This is a universal value for the long-range interaction S8 . Substituting Eq. (S42) into Eq. (S40), one finds the magnetoconductance of one surface (for l B < l T ) In the weak coupling limit Eq. (S44) yields In this analysis, we keep track of logarithmic terms and ignore non-singular corrections, even if they depend on B.
The dimensionless magnetoconductance of a single surface is thus given by Next we analyze the magnetoconductance in a weak magnetic field (l T < l B ). This implies a two-step RG procedure.
The renormalization for the running scales L < l T is followed by the renormalization with a modified beta-function at l T < L < l B . The first part of this procedure (L < l T ) is described by Eqs. (S40) and (S41). At l = l T the flow of the interaction constant in the superconducting channel stops, determining the value of γ c = γ c (y T ) for longer scales.
Therefore, terms that are linear in γ c no do not depend on l B , and do not contribute to magneto-conductance. In addition, the contribution of the interaction-induced magnetoconductance changes, see Eq. (S39). As a result, the second stage of the RG procedure (l T < L < l B ) is performed with following RG equation: resulting in The corresponding magnetoconductance for l T < l B < l ϕ then reads: In addition, there are real transitions that are influenced by magnetic field and thus affect the MC. They are not accounted in the above RG procedure and should be added separately. These are the Maki-Thompson correction S11,S12,S13 and dephasing processes. The Maki-Thompson corrections is quadratic in the amplitude γ c and reads: The dephasing time is affected by the magnetic field, see Eq. S54 below. This influences the HLN term in Eq. S44 and thus the MC.

Dependence of the dephasing time on magnetic field
The standard HLN term in Eq. S44, was computed so far with a logarithmic accuracy. It is known that a more careful computation yields Here, Ω B = 4DeB/ . The dephasing rate in the presence of the magnetic field is given by which induces an additional dependence of the HLN term on B in the range T Ω B 1/τ ϕ : The fist term corresponds to the standard one-loop HLN formula, while the second term generates an additional B-dependent contribution to the conductance. Though it is typically small, it is proportional to 1/g and, therefore, needs to be included in our analysis. This correction should be added to the MC in the regime l B > l T .

D. Summary of theoretical results
We now summarize the final formulas derived above for the total two-surface MC. In the absence of external gates, the MC reads In the presence of the gate, placed at distance d from the sample, the CWAL term does not produce a logarithmic dependence on magnetic field for l B > d. In this case, the equations above are modified as follows: the main part of the manuscript, assuming interaction constant is close to the universal limit, This is justified if the bare value at ultraviolet limit γ c (0) is large enough, such that the RG flow comes close to the universal limit at scale of the order d.

Evaluation of the effective aspect ratio with finite element analysis
As shown in Fig. S2, the device geometry used in this work is not the standard Hall bars. The irregular sample shapes, as well as the overlap between metal electrodes and TI flakes, preclude a straightforward conversion of the four-point resistance R xx to the sheet longitudinal conductivity σ xx . We apply the finite-element analysis to simulate the electric potential distribution S14 , which serves as a basis for reliable evaluation of σ xx .
The task of the numerical simulation is to find the geometrical factor f s (i.e., number of equivalent squares for the quasi-2D transport) for each four-point resistance measurement. For instance, when a current is applied between the source and drain (denoted as electrodes 5 and 6 in Fig. S3), the voltage measured between electrodes 1 and 2 determines the four-point resistance R 56,12 . The resistance per square follows In our devices the electrical contacts are only in touch with the bottom surface of the TI flakes, so there might be considerable asymmetry in the current flow and the electrical field distribution between the top and bottom surfaces.
The numerical analysis is further complicated by the irregular sample shape, ratio between the surface and bulk conductivities, as well as contact resistances and their local variations. Nevertheless, the simulation can be simplified by working out two extreme cases. One is the low bulk-conductivity limit, at which the bulk resistivity is assigned with a value exceeding the upper bound and the surface resistivity takes the minimum value observed in the experiment.
The other is the high bulk-conductivity limit, at which the bulk and surface resistivities are set at their lower and upper bounds, respectively. When the transport parameters are varied from the first limit (surface-dominating) toward the second one (bulk-dominating), the asymmetry between the top and bottom surfaces is expected to decrease. As will be shown below, these two limits exert a strong constraint on the uncertainty level of evaluating the geometrical factors. Fig. S3(a) depicts the electric potential distribution on the top and bottom surfaces of sample A at the limit of low bulk conductivity and high surface conductivity. The bulk resistivity is estimated to be 1-10 Ω·cm for our BSTS samples, so we use ρ xx = 20 Ω·cm in the finite element analysis in order to be on the safe side. The sheet resistivity of sample A is about 2-8 kΩ (or 4-16 kΩ each surface). The TI surface is simulated with a 1 nm thick layer with high 3D conductivity, but the sheet resistance is fixed at 4 kΩ per square for each surface. In contrast, the sheet resistivity of the bulk layer at this limit would be 20 Ω·cm/26 nm≈ 7.7 MΩ, three orders magnitude larger than the surface resistivity (∼2 kΩ, when the contributions from the top and bottom surfaces are included).
Shown in Fig. S3(b) is the electric potential distribution at the limit of high bulk conductivity and low surface conductivity. A bulk resistivity of 1 Ω·cm and a surface resistivity of 8 kΩ are used in the simulation. The overall electric field potential distribution, however, varies very little in comparison with that in the previous regime (i.e. low bulk/high surface conductivity, see Fig. S3(a)).
Here, we introduce a parameter γ to characterize the asymmetry between the top and bottom surfaces. It is defined as where V i,t and V i,b denote the electric potential on electrode i on the top and bottom surfaces, respectively. For sample A, γ is in a range of 0.02% to 0.18% at the low bulk conductivity limit, whereas γ is less than 0.01% for each of the electrode at the high bulk conductivity limit. The small γ values can be attributed to the quasi-2D nature of sample A, in which the TI flake is only 26 nm thick, nearly three orders of magnitude smaller than its lateral sizes.
Moreover, the bulk of BSTS material is not a perfect insulator, and this helps reduce the asymmetry between the two surfaces. As shown in Fig. S3c&d, the current distribution in the central area is almost uniform, only slightly influenced by the electrical contacts. We therefore conclude that the asymmetry effect can be largely neglected during the evaluation of the geometrical factor for longitudinal transport. The finite element analysis yields f s =0.61-0.63, very close to f s = 0.60 obtained by the straightforward geometrical consideration, in which the voltage probes are treated as ideal contacts, and thus do not modify the current flow.
When the magnetic field is applied, the above analysis will remain valid if the Hall angle, defined as θ H = ρ xy /ρ xx = µB, is much smaller than 1, where µ is the carrier mobility. As depicted in Fig. S4d, θ H is less than 0.07 at B = 0.5 T for all gate voltages. Beyond 0.5 T, the weak antilocalization effect is suppressed significantly by the magnetic field.
Since in this work the HLN fits are mostly restricted to the fields below 0.25 T, it is not necessary to consider the complications caused by the large Hall angles.
In principle, one could also use the finite element analysis to obtain the geometric factor for the Hall resistivity, defined as f H = 1/(n s qR H ). Here, n s is the sheet carrier density, q is the carrier's charge, and R H is the Hall coefficient.
The Hall voltages are, however, prone to the influences of metal electrodes, as they extend considerably into the interior of TI flakes. The uncertainty level of f H is expected to be much higher than its longitudinal counterpart. Fortunately, the carrier density can be extracted by utilizing the electrostatic field effect (i.e., the capacitive coupling), with the gating efficiency calibrated with quantum oscillations. As shown in    Fig. S4a) are selected to show the Hall resistance curves for magnetic fields up to 2 T (Fig. S4b)). The R xy curves exhibit good linearity for V BG < −30 V or V BG > −15 V. For the gate voltages in between, the Hall resistance deviates from the linear dependence. This can be ascribed to the formation of electron and hole puddles due to the long-range disorder. This type of transport is often referred to as the ambipolar regime in the literature.  For a 2D Dirac fermion system, the short-range scattering would lead to transport lifetime τ tr satisfying which leads to R xx independent of k F or carrier density. In contrast, the Coulomb interaction in 2D results in which is valid under the Thomas-Fermi approximation. In the latter case, one would expect R xx ∝ k −2 F ∝ n −1 . As depicted in Fig. S5, R xx is approximately proportional to n −0.57 if the electron density is not too low (i.e., outside of the ambipolar region). The momentum relaxation time is thus nearly independent of the carrier density, consistent with the τ tr values displayed in Table II. On one hand, such a carrier density dependence is different from graphene, which exhibits the typical behavior of 2D Coulomb potential S15 . On the other hand, it also differs greatly from the isotropic short-range scattering. This unique behavior is probably related to the long-range electric potential fluctuations induced by the compensation doping in the BSTS, which has been considered as a driving force for the formation of electron and hole puddles in the bulk.

A. Summary of MC corrections of various length scales
As discussed in Supplementary Note 1, the MC of TI surface states can be modified by several types of quantum corrections, including the first-and second-order interference effects, Maki-Thompson (MT) and the density of states (DOS) interaction corrections in the particle-particle channel, and the mutual effect of the WAL and electron-electron interaction (EEI) in the particle-hole channel (i.e., the cross-term correction, denoted as CWAL, which has a counterpart in an electron system with the orthogonal symmetry S5 ). In general, it is quite challenging to quantitatively account for the MCs in TIs because these corrections are related to multiple length scales (see Supplementary Note I and Table III states, opposite to all other quantum corrections mentioned above. For l B l ϕ , ∆σ CWAL follows a ln(B/B tr ) dependence, but with a coefficient varies from 3/(2πg) to 1/(πg) as the magnetic field increases from a regime of l B > l T to l B < l T (see Ref. S5 and Section I C 1). When a metallic gate is nearby, the cross-term correction is susceptible to the screening effect and starts to saturate at l B ∼ d (in other words, ∆σ CWAL gets suppressed for B < 2π /(ed 2 )). In this work, d refers to the distance between a TI surface and the metallic top-gate (a Cr-Au layer), since the bottom gate is a silicon substrate separated from the TI flake by a 300 nm thick SiO 2 layer.
Since the analysis of conductance corrections is often carried out with respect to the magnetic field, it is more convenient to discuss the relevant physics in a scale of magnetic field. Table III gives the definition of the characteristic B-fields corresponding to the aforementioned length scales, and when the magnetic field is sufficiently strong, so that l B < l ϕ and l B < l T are satisfied, and at the same time B is weak enough to stay diffusive (i.e., l B > l tr ), each of the MC corrections approaches a lnB-type dependence asymptotically (see Supplementary Note 1, SectionI D, and Ref. S5 ). Following the definitions of these length scales in Section I (summarized in Table III), the crossover conditions

B. Analysis of the MC corrections
Although the experimental MC often exhibits the lnB-like behavior in sufficiently high magnetic fields, it is very difficult to identify the sources of quantum corrections by carrying out a logarithmic fit. This can be attributed to the following two reasons: 1) the magnetic field range is too narrow or non-existent to perform a meaningful fit, if both B 2πB T (equivalent to l B l T ) and B B tr are satisfied with rigor, as shown in Table IV; 2) the MC curves at B B T usually don't contain sufficient curvatures to allow for a reliable extraction of over two fitting parameters, and hence insufficient to differentiate so many different sources of conductance corrections.
In order to disentangle various types of MC corrections, we fit the MC data in low magnetic fields to Eq. (1), typically in a range of B = 0-15B ϕ . This field range satisfies B B tr for the TI samples studied in this work (see Table IV). Another benefit of this fitting range is that the MC curves possess adequate curvatures to obtain the fitting parameters (α and B ϕ ) reliably. More importantly, the MC curves follow or only slightly deviates from the Eq. (1) in such low magnetic fields. Given the fact that MC corrections from the first-and second-order quantum interference effects are well described with Eq. (1) (see below for further justification), the total contribution of various EEI effects can be approximately casted into a form of the Y (B/B ϕ )-function (defined as Y (x) = ψ(1/2 + 1/x) + ln x)), at least for B < 15B ϕ . Since none of the MT, DOS and cross-term corrections take a form of Y (B/B ϕ )-function rigorously, the good agreement between the experimental MC curves and the HLN fits (simplified for low fields) is worth some discussion. As the mentioned in main text, the MT correction is proportional to the Y (B/B ϕ ) function when B 2πB T (see (Eq. (2)) in the main text). For a wider range of magnetic fields, the MT term can be written as S5 with G 0 = e 2 /(2π 2 ). This equation is similar to the MC correction due to the quantum interference effects: in which B tr replaces 2πB T in Eq. S63, and the prefactor satisfies α QI (1 + 1/(πg)) for the TI surface states, with 1 and 1/πg in the bracket originating from the first-and second-order quantum interferences, respectively. In Eqs. S63 and S64, the second term leads to saturation of the MC at high magnetic fields (B 2πB T or B B tr ). In weak magnetic fields (B 2πB T or B B tr ), these two terms are negligible and these two equations reduce to Eqs. (2) and (1) in the main text.
As shown in Table IV, the condition B B tr can be satisfied for the field range B = 0-15B ϕ , so the fits to Eq. (1) are justified for the conductance corrections caused by quantum interference. In contrast, the 2πB T value is comparable to the maximum magnetic field (B max = 15B ϕ ) for a typical fit. As a result, the B T term cannot be ignored and the magnitudes of MC are expected to be slightly smaller than those given by Eq.
(2). As depicted in Fig. S6, the B T term leads to nearly quadratic corrections to the MC in low magnetic fields. Its influence on the HLN fit is therefore quite weak for the fitting range 0-15B ϕ , but expected to be more substantial as the fitting range is expanded. This effect is indeed borne out in our experimental observation. As shown in Fig. 3c, the α values of electron side decrease slightly as the fitting range increases from 0-15B ϕ to 0-25B ϕ . Similar results are also obtained for fitting ranges 0-10B ϕ and 0-15B ϕ (Fig. S7). Despite the small variations in the α value for different fitting ranges, the overall gate-voltage dependences of α have some common characteristics, such as a pronounced maximum appearing near the charge neutral point, enhancement of α diminishing at large electron densities, and α values nearly independent of the fitting range on the hole side.
As pointed out in Ref. S5 , at higher magnetic fields (B 2πB T , or equivalently, Ω B 2πT ), the MT correction vanishes in an asymptotic form different from Eq. S63. However, this subtle difference is negligible, since the MT correction is overwhelmed by the DOS term under this circumstance. The DOS correction is quadratic and overwhelmed by the MT term in low magnetic fields. For B 2πB T , the DOS correction approaches Since the MC correction due to the MT and DOS corrections have the same sign, the decreasing MT contribution at large B is partially compensated by the DOS term, making the total Cooper-channel correction only slightly deviate from the single Y (B/B ϕ ) function (i.e., Eq. (2)). Nevertheless, we also performed the HLN fits of the DOS correction alone with different fitting ranges. It was found that increasing the field range leads to an increase in the α value, instead of the decreasing α observed in the experiment. We therefore conclude that the contribution of DOS term is marginal for the fitting range 0-15B ϕ .
As to the cross-term in the particle-hole channel, its correction to MC, ∆σ CWAL , is suppressed by the screening effect of the metallic top-gate. As shown in Table I, in sample A the top metallic gate is separated from the top and bottom surfaces by distances of d 1 = 54 nm and d 2 = 80 nm, respectively. They correspond to magnetic fields of about 0.23 T and 0.10 T, covering most of the field ranges of the HLN fits carried out in this work.
It is also worthy noting that the B ϕ and B T values in Table IV are given for the data taken at T = 1.6 K.
Nevertheless, the above statement remains valid for other temperatures, because both B ϕ and B T increase linearly with T and their ratio remains nearly constant. If the conductance g is not too small, the dephasing rate can be approximately written as which is independent of the temperature.
In this work, the g values are in a range of about 3-14, corresponding to 2πB T values roughly 10-33 times B ϕ (or 4B T ≈ 7-21B ϕ ). The degree of overlap between B > 4B T (equivalent to l B < l T ) and the HLN fitting range (typically B < B max = 15B ϕ ) is a function of lng/g and decreases with increasing g. The DOS correction is thus expected to vanish at the large g limit. On the other hand, the MC corrections caused by various EEI effects also diminish at g 1. As shown in Eq. S57, both the second-order interference and MT corrections, have a prefactor scaled with 1/g.The cross-term correction has a prefactor proportional to 1/g as well (see Eq. S39). Even if the screening by the metallic top-gate could not fully suppress the CWAL term, a 1/g-type dependence would also appear, similar to the case of orthogonal symmetry S5 . In summary, all of these corrections to the MC decrease monotonically with 1/g, providing a direct explanation of the nearly linear dependence of α on 1/g depicted in Fig. 4.

C. Irrelevance of the dephasing term
Finally, we remark on possible MC corrections of the dephasing term (i.e., the second term in Eq. S55). Although the magnetic field dependence of dephasing rate may contribute to the MC (Supplementary Note 1, Section I C 3), this correction requires that B 2πB T and B B ϕ to be satisfied simultaneously. As shown in Table IV, this condition cannot be met due to the modest g values encountered in this work. Therefore, this dephasing term can be dropped out in our discussion.
VI. SUPPLEMENTARY NOTE 6 Negligible influence of quadratic magnetoresistance background on the

HLN fits
The quadratic and linear magnetoresistances (MRs) have also been observed in topological insulators previously by many groups S16,S17,S18 . Among 3D topological insulators, the most pronounced linear MR was observed, in a very wide range of magnetic fields, in Bi 2 Te 3 S16,S18 . However, it was also shown that the linear MR crosses over to a parabolic dependence at sufficiently low fields (typically of the order 0.1 T), when the WAL effect is suppressed by raising the temperature to about 10 K S16 . For BSTS, no clear evidence has been reported so far for this type of linear MR, despite extensive studies (see, for instance, Refs. S19,S20 and Fig. S8). The quadratic MR is thus expected to exist in a wider range of magnetic fields in BSTS than Bi 2 Te 3 . In this work, we are mainly concerned with the MC corrections related to the WAL effect. The corresponding data analysis is limited to magnetic fields lower than 0.25 T. The low-field quadratic background can be obtained by fitting the MR data taken at high magnetic fields (4-9 T for sample A, see Fig. S8). After this background is subtracted, the low-field MR data is converted to the MCs and fitted to the HLN equation (or its generalization). Table V shows that the result of HLN fit is barely modified by the removal of quadratic background. Influence of asymmetry between the top and bottom surfaces on the

HLN fits
If the top and bottom surfaces are not equivalent, the analysis of MCs would be further complicated by such an asymmetry. Although a lot of efforts have been used to prevent the TI surface states from degradation during the device fabrication process, it is impossible to ensure that these two surfaces are perfectly identical in all aspects. At the temperatures of interest to this work, EEI is the main source of electron dephasing, and the dephasing time τ ϕ can be determined from the conductivity. The corresponding dephasing field is given by where l ϕ is the dephasing length, and the diffusion coefficient is given by D = 1 2 ν 2 F τ . If the bulk does not contribute, low field MC at g 1 limit can be written as where i =1,2 is the surface index. If the top and bottom surfaces are identical, one expects B ϕ,1 = B ϕ,2 , the above expression will reduce to Eq. (1) in the main text with α =1. When the two surfaces are not equivalent, the MC in principle cannot be described perfectly with Eq. (1). Nevertheless, it has been found by many studies that low field MC in many TI samples can be fitted very well with Eq. (1) by treating both α and are B ϕ as free parameters. The α values obtained from such fits are mostly smaller than 1. Fig. S9 shows that α decrease monotonically as the ratio B ϕ,1 /B ϕ,2 increases. When the asymmetry between the two surfaces is not too large, the correction to α is quite modest. For instance, α becomes 0.963 and 0.9045 for B ϕ,1 /B ϕ,2 =2 and 3, respectively. Moreover, the α value for the asymmetric case (B ϕ,1 /B ϕ,2 = 1) is always smaller than that for the symmetric case (B ϕ,1 /B ϕ,2 = 1). This statement remains valid when the second loop and MT corrections are included because they are described with the function of same length scales (l B and l ϕ ).

Evaluation of the Fermi velocity with the dephasing rate
Dephasing field B ϕ is the other important parameter extracted from the HLN fits. Figure 2d shows the temperature dependence of B ϕ for the five representative gate-voltage points. Each of them can be fitted to a linear function It is generally accepted at temperatures below 10 K, EEI is the dominant source of dephasing, and the dephasing time τ ϕ in a 2D system follows This leads to the linear temperature dependence of dephasing rate 1/τ ϕ , which is presumably responsible for the Fig. S10 depicts the dephasing rates calculated selfconsistently with Eq. S70 for the five gate-voltage points. Assuming b ϕ , the slope of each B ϕ -T curve in Fig. 2d, being solely contributed by the EEI-induced dephasing, we obtain All parameters on the right-hand side of the above equation can be determined by the transport measurements except v F . Using theoretical values of τ EEI ϕ , we could deduce The Fermi velocities of points D and E are evaluated to be 4.2 × 10 5 m/s and 4.0 × 10 5 m/s, respectively. V TG taken at T = 1.6 K and in a perpendicular field of 1 T. b, c Parallel field MC curves for five gate-voltage sets (marked as points A-E in panel a). The low field quadratic MC curves (panel b)suggest a transport process dominated by the surface states (points C, D and E, with the Fermi level near CNP or above), whereas the deviation from the quadratic form at points A and B suggests the participation of the bulk states. The quasilinear MC at high magnetic fields (B > 2 T, panel c) remains to be understood.